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1 A survey of processors with explicit multithreading 
Theo Ungerer, Borut Robic, Jurij Silc 
March 2003 ACM Computing Surveys (CSUR), volume 35 issue 1 

Full text available: pdf(920.16 KB) Additional Information: full citation, abstract, references, citings, index terms 

Hardware multithreading is becoming a generally applied technique in the next generation of 
microprocessors. Several multithreaded processors are announced by industry or already 
into production in the areas of high-performance microprocessors, media, and network 
processors. A multithreaded processor is able to pursue two or more threads of control in 
parallel within the processor pipeline. The contexts of two or more threads of control are 
often stored in separate on-chip register sets. Unused i ... 
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3 Architecture and systems: Teleport messaging for distributed stream programs 
William Thies, Michal Karczmarek, Janis Sermulins, Rodric Rabbah, Saman Amarasinghe 
June 2005 Proceedings of the tenth ACM SIGPLAN symposium on Principles and 
practice of parallel programming 

Full text available: pdf(352.12 KB) Additional Information: full citation, abstract, references, index terms 

In this paper, we develop a new language construct to address one of the pitfalls of parallel 
programming: precise handling of events across parallel components. The construct, 
termed teleport messaging, uses data dependences between components to provide a 
common notion of time in a parallel system. Our work is done in the context of the 
Synchronous Dataflow (SDF) model, in which computation is expressed as a graph of 
independent components (or actors) that communicate in regular ... 
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A vast body of theoretical research has focused either on overly simplistic models of parallel 
computation, notably the PRAM, or overly specific models that have few representatives in 
the real world. Both kinds of models encourage exploitation of formal loopholes, rather than 
rewarding development of techniques that yield performance across a range of current and 
future parallel machines. This paper offers a new parallel machine model, called LogP, that 
reflects the critical technology tre ... 

Keywords: PRAM, complexity analysis, massively parallel processors, parallel algorithms, 
parallel models 



6 Experience Using Multiprocessor Systems— A Status Report 
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Prakash Linga, Indranil Gupta, Ken Birman 

October 2003 Proceedings of the 2003 ACM workshop on Survivable and self- 
regenerative systems: in association with 10th ACM Conference on 
Computer and Communications Security 

Full text available: pdf(1.07 MB) Additional Information: full citation, abstract, references 

Denial of service attacks on peer-to-peer (p2p) systems can arise from sources otherwise 
considered non-malicious. We focus on one such commonly prevalent source, called 
"churn". Churn arises from continued and rapid arrival and failure (or departure) of a large 
number of participants in the system, and traces from deployments have shown that it can 
lead to extremely stressful networking conditions. It has the potential to increase host loads 
and block a large fraction of normal insert and lo ... 
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Milo M. K. Martin, Mark D. Hill, David A. Wood 

May 2003 ACM SIGARCH Computer Architecture News, Proceedings of the 30th 
annual international symposium on Computer architecture, volume 31 issue 2 

Full text available: pdf Additional Information: full citation, abstract, references, citings 

Many future shared-memory multiprocessor servers will both target commercial workloads 
and use highly-integrated "glueless" designs. Implementing low-latency cache coherence in 
these systems is difficult, because traditional approaches either add indirection for common 
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cache-to-cache misses (directory protocols) or require a totally-ordered interconnect 
(traditional snooping protocols). Unfortunately, totally-ordered interconnects are difficult to 
implement in glueless designs. An ideal coherenc ... 
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Mark Stemm, Randy H. Katz 

December 1998 Mobile Networks and Applications, volume 3 issue 4 
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terms 

No single wireless network technology simultaneously provides a low latency, high 
bandwidth, wide area data service to a large number of mobile users. Wireless Overlay 
Networks - a hierarchical structure of room-size, building-size, and wide area data 
networks - solve the problem of providing network connectivity to a large number of mobile 
users in an efficient and scalable way. The specific topology of cells and the wide variety of 
network technologies that comprise wireless o ... 

11 A survey of processors with explicit multithreading 
Theo Ungerer, Borut Robic, Jurij Silc 

March 2003 ACM Computing Surveys (CSUR), volume 35 issue 1 

Full text available: pdf(920.16 KB) Additional Information: full citation, abstract, references, citings, index terms 

Hardware multithreading is becoming a generally applied technique in the next generation 
of microprocessors. Several multithreaded processors are announced by industry or already 
into production in the areas of high-performance microprocessors, media, and network 
processors. A multithreaded processor is able to pursue two or more threads of control in 
parallel within the processor pipeline. The contexts of two or more threads of control are 
often stored in separate on-chip register sets. Unused i ... 

Keywords: Blocked multithreading, interleaved multithreading, simultaneous 
multithreading 
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13 On the partitionability of hierarchical radiosity 
Robert Garmann 

October 1999 Proceedings of the 1999 IEEE symposium on Parallel visualization and 
graphics 
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The Hierarchical Radiosity Algorithm (HRA) is one of the most efficient sequential 
algorithms for physically based rendering. Unfortunately, it is hard to implement in parallel. 
There exist fairly efficient shared-memory implementations but things get worse in a 
distributed memory (DM) environment. In this paper we examine the structure of the HRA 
in a graph partitioning setting. Various measurements performed on the task access graph 
of the HRA indicate the existence of s ... 
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Evolutionary design of complex software (EDCS) demonstration days 1999 
Wayne Stidolph 

January 2000 ACM SIGSOFT Software Engineering Notes, volume 25 issue 1 
Full text available: pdf Additional Information: full citation, abstract, index terms 

This report summarizes the Product/Technology demonstrations given at Defense Advanced 
Research Projects Agency (DARPA) Evolutionary Design of Complex Software (EDCS) 
Program Demonstration Days, held 28-29 June 1999 at the Sheraton National Hotel, 
Arlington, VA. 

15 Papers: Tactile user interface: Haptic techniques for media control 

Scott S. Snibbe, Karon E. MacLean, Rob Shaw, Jayne Roderick, William L. Verplank, Mark 
Scheeff 

November 2001 Proceedings of the 14th annual ACM symposium on User interface 
software and technology 

Full text available: pdf(1.05 MB) Additional Information: full citation, abstract, references, citings, index terms 

We introduce a set of techniques for haptically manipulating digital media such as video, 
audio, voicemail and computer graphics, utilizing virtual mediating dynamic models based 
on intuitive physical metaphors. For example, a video sequence can be modeled by linking 
its motion to a heavy spinning virtual wheel: the user browses by grasping a physical force- 
feedback knob and engaging the virtual wheel through a simulated clutch to spin or brake 
it, while feeling the passage of individual frames. ... 

Keywords: Haptic force feedback, interaction techniques, media browsing, multimedia 
control, tangible interfaces, user interface design, video editing 
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David Kotz, George Cybenko, Robert S. Gray, Guofei Jiang, Ronald A. Peterson, Martin O. 
Hofmann, Daria A. Chacon, Kenneth R. Whitebread, James Hendler 
April 2002 Mobile Networks and Applications, Volume 7 issue 2 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms 

Wireless networks are an ideal environment for mobile agents, since their mobility allows 
them to move across an unreliable link to reside on a wired host, next to or closer to the 
resources that they need to use. Furthermore, client-specific data transformations can be 
moved across the wireless link and run on a wired gateway server, reducing bandwidth 
demands. In this paper we examine the tradeoffs faced when deciding whether to use 
mobile agents in a data-filtering application where numerous ... 

Keywords: RPC, information filtering, mobile agent, mobile code, performance analysis, 
wireless network 



17 Missing the memory wall: the case for processor/memory integration 
Ashley Saulsbury, Fong Pong, Andreas Nowatzyk 

May 1996 ACM SIGARCH Computer Architecture News, Proceedings of the 23rd 
annual international symposium on Computer architecture, volume 24 issue 2 

Full text available: pdf(1.45 MB) Additional Information: full citation, abstract, references, citings, index terms 

Current high performance computer systems use complex, large superscalar CPUs that 
interface to the main memory through a hierarchy of caches and interconnect systems. 
These CPU-centric designs invest a lot of power and chip area to bridge the widening gap 
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between CPU and main memory speeds. Yet, many large applications do not operate well 
on these systems and are limited by the memory subsystem performance. This paper argues 
for an integrated system approach that uses less-powerful CPUs that are ... 

18 STiNG: a CC-NUMA computer system for the commercial marketplace 
Tom Lovett, Russell Clapp 

May 1996 ACM SIGARCH Computer Architecture News, Proceedings of the 23rd 
annual international symposium on Computer architecture, volume 24 issue 2 

Full text available: pdf(1.30 MB) Additional Information: full citation, abstract, references, citings, index terms 

"STiNG" is a Cache Coherent Non-Uniform Memory Access (CC-NUMA) Multiprocessor 
designed and built by Sequent Computer Systems, Inc. It combines four processor 
Symmetric Multi-processor (SMP) nodes (called Quads), using a Scalable Coherent Interface 
(SCI) based coherent interconnect. The Quads are based on the Intel P6 processor and the 
external bus it defines. In addition to 4 P6 processors, each Quad may contain up to 4 
GBytes of system memory, 2 Peripheral Component Interface (PCI) busses for ... 

19 Performance prediction of parallel processing systems: the PAMELA methodology 
Arjan J. C. van Gemund 

August 1993 Proceedings of the 7th international conference on Supercomputing 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms 

In this paper we present a new methodology for the performance prediction of parallel 
programs on parallel platforms ranging from shared-memory to distributed-memory 
(vector) machines. The methodology comprises a procedural program and machine 
specification paradigm based on PAMELA (PerformAnce ModEling LAnguage), along with a 
performance calculus, called "serialization analysis". This calculus extends conventional 
parallel program analysis technology by explicitly accounting fo ... 

20 Performance analysis of mobile agents for filtering data streams on wireless networks 
David Kotz, Guofei Jiang, Robert Gray, George Cybenko, Ronald A. Peterson 

August 2000 Proceedings of the 3rd ACM international workshop on Modeling, analysis 
and simulation of wireless and mobile systems 

Full text available: pdf(847.89 KB) Additional Information: full citation, abstract, references, citings, index terms 
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Wireless networks are an ideal environment for mobile agents, because their mobility allows 
them to move across an unreliable link to reside on a wired host, next to or closer to the 
resources they need to use. Furthermore, client-specific data transformations can be moved 
across the wireless link, and run on a wired gateway server, with the goal of reducing 
bandwidth demands. In this paper we examine the tradeoffs faced when deciding whether 
to use mobile agents to support a data-f ... 
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